Automatic Story Segmentation for TV News Video Using Multiple Modalities

Authors

  • Emilie Dumont
  • Georges Quénot
Abstract

While video content is often stored in rather large files or broadcast in continuous streams, users are often interested in retrieving only a particular passage on a topic of interest to them. It is therefore necessary to split video documents or streams into shorter segments that correspond to appropriate retrieval units. We propose a method for the automatic segmentation of TV news videos into stories, based on multiple descriptors. The selected multimodal features are complementary and give good insight into story boundaries. Once extracted, these features are expanded with a local temporal context and combined by an early fusion process. The story boundaries are then predicted using machine learning techniques. We evaluate the system in experiments that use the data and protocol of the TRECVID 2003 story boundary detection task, and we show that the proposed approach outperforms the state-of-the-art methods while requiring a very small amount of manual annotation.
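The abstract gives no implementation details, so the following is only a minimal sketch of what "expanding features with a local temporal context" and "early fusion" could look like. It assumes that each modality yields one feature vector per candidate boundary point, that the context is a symmetric window of `radius` neighbours, and that sequence edges are handled by repeating the first or last vector. The function names, the `radius` parameter, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def expand_with_context(features: Sequence[Vector], radius: int) -> List[Vector]:
    """Concatenate each vector with its neighbours within +/- radius.

    Assumption: positions outside the sequence are clamped to the
    first or last vector (edge repetition).
    """
    n = len(features)
    expanded: List[Vector] = []
    for t in range(n):
        row: Vector = []
        for offset in range(-radius, radius + 1):
            neighbour = min(max(t + offset, 0), n - 1)  # clamp at the edges
            row.extend(features[neighbour])
        expanded.append(row)
    return expanded


def early_fusion(modalities: Sequence[Sequence[Vector]], radius: int) -> List[Vector]:
    """Expand every modality, then concatenate them per time step.

    Assumption: all modalities are sampled at the same candidate points.
    """
    expanded = [expand_with_context(m, radius) for m in modalities]
    fused: List[Vector] = []
    for t in range(len(expanded[0])):
        row: Vector = []
        for rows in expanded:
            row.extend(rows[t])
        fused.append(row)
    return fused


# Toy data (hypothetical): 4 candidate points, a 2-D visual and a 1-D audio descriptor.
visual = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
audio = [[0.0], [1.0], [2.0], [3.0]]

fused = early_fusion([visual, audio], radius=1)
print(len(fused), len(fused[0]))  # 4 9
```

Each fused vector could then be passed, together with a boundary or non-boundary label, to any supervised classifier; which learner the authors used is not stated in the abstract.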

Similar resources

Automatic Story Segmentation for Spoken Document Retrieval

We have been working on speech retrieval based on Cantonese television news programs. Our video archive contains over 20 hours of news programs provided by a local television station. These programs have been hand-segmented into video clips, where each clip is a self-contained news story. The audio tracks in our archive are indexed by Cantonese speech recognition. This is integrated with a vect...


Discovery and fusion of salient multimodal features toward news story segmentation

In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...


News Video Story Segmentation Using Fusion of Multi-level Multi-modal Features in Trecvid

In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...


Automatic Segmentation, Aggregation and Indexing of Multimodal News Information from Television and the Internet

The global diffusion of the Internet has enabled the distribution of informative content through dynamic media such as RSS feeds and video blogs. At the same time, the decreasing cost of electronic devices has increased the pervasive availability of the same informative content in the form of digital audiovisual data. This article presents a system for the large-scale unsupervised acquisition, ...


Journal:
  • Int. J. Digital Multimedia Broadcasting

Volume: 2012  Issue:

Pages: -

Publication date: 2012